23 research outputs found
The Bracketing Guidelines for the Penn Chinese Treebank (3.0)
This document describes the bracketing guidelines for the Penn Chinese Treebank Project. The goal of the project is the creation of a 100-thousand-word corpus of Mandarin Chinese text with syntactic bracketing. The Chinese Treebank has been released via the Linguistic Data Consortium (LDC) and is available to the public.
This document can be divided into six parts. Section I discusses six fundamental grammatical relations that are represented in the Treebank. Section II introduces the bracketing tagset, which includes 23 syntactic labels, 26 functional tags, and 7 tags for null elements. Sections III, IV, and V specify our annotation schemata for noun phrases, verb phrases, and other minor categories, respectively. Section VI describes our treatment of empty categories, such as traces for syntactic movement, PRO for control, and pro for argument drop. Sections VII and VIII cover coordinated clauses and subordinate clauses. Sections IX, X, and XI specify the way we handle punctuation, ambiguity, and some problematic cases.
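As a rough illustration of how a bracketed tree with syntactic labels, functional tags, and null elements can be read programmatically, the sketch below parses a Penn-style bracketed string. The example sentences and the helper names (parse_brackets, split_label, leaves) are hypothetical and are not taken from the guidelines; the labels shown (IP, NP, VP, PN, VV, the functional tag -SBJ, and the null-element label -NONE-) follow the Treebank's conventions.

```python
def parse_brackets(text):
    """Parse a Penn-style bracketed string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '('")
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())    # nested constituent
            else:
                children.append(tokens[pos])  # word or null-element marker
                pos += 1
        pos += 1  # consume the closing ")"
        return (label, children)

    tree = node()
    if pos != len(tokens):
        raise ValueError("unexpected tokens after the root bracket")
    return tree


def split_label(label):
    """Split 'NP-SBJ' into ('NP', ['SBJ']); keep '-NONE-' whole."""
    if label.startswith("-"):
        return label, []
    head, *tags = label.split("-")
    return head, tags


def leaves(tree):
    """Return the overt words of a tree, skipping null elements."""
    label, children = tree
    words = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        elif label != "-NONE-":
            words.append(child)
    return words


# Hypothetical example: a pronoun subject and an intransitive verb.
tree = parse_brackets("(IP (NP-SBJ (PN 他)) (VP (VV 来)))")
# Hypothetical example with a dropped subject marked as a null element.
dropped = parse_brackets("(IP (NP-SBJ (-NONE- *pro*)) (VP (VV 来)))")
```

Here leaves(dropped) omits the *pro* marker, which mirrors the distinction the guidelines draw between overt words and empty categories.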
Semi-supervised Multi-modal Emotion Recognition with Cross-Modal Distribution Matching
Automatic emotion recognition is an active research topic with a wide range of
applications. Due to the high manual annotation cost and inevitable label
ambiguity, the development of emotion recognition datasets is limited in both
scale and quality. Therefore, one of the key challenges is how to build
effective models with limited data resources. Previous works have explored
different approaches to tackle this challenge, including data enhancement,
transfer learning, and semi-supervised learning. However, these existing
approaches suffer from weaknesses such as training instability, large
performance loss during transfer, or marginal improvement.
In this work, we propose a novel semi-supervised multi-modal emotion
recognition model based on cross-modality distribution matching, which
leverages abundant unlabeled data to enhance the model training under the
assumption that the inner emotional status is consistent at the utterance level
across modalities.
We conduct extensive experiments to evaluate the proposed model on two
benchmark datasets, IEMOCAP and MELD. The experimental results show that the
proposed semi-supervised learning model can effectively utilize unlabeled data
and combine multiple modalities to boost emotion recognition performance,
outperforming other state-of-the-art approaches under the same conditions.
The proposed model also achieves competitive performance compared with existing
approaches which take advantage of additional auxiliary information such as
speaker and interaction context.
Comment: 10 pages, 5 figures, to be published in ACM Multimedia 202
Topology-Aware Surface Reconstruction for Point Clouds
We present an approach to inform the reconstruction of a surface from a point
scan through topological priors. The reconstruction is based on basis functions
which are optimized to provide a good fit to the point scan while satisfying
predefined topological constraints. We optimize the parameters of a model to
obtain a likelihood function over the reconstruction domain. The topological
constraints are captured by persistence diagrams, which are incorporated in the
optimization algorithm to promote the correct topology. The result is a novel
topology-aware technique which can (1) weed out topological noise from point
scans and (2) capture certain nuanced properties of the underlying shape which
could otherwise be lost while performing surface reconstruction. We showcase
results reconstructing shapes with multiple potential topologies, compare to
other classical surface reconstruction techniques, and show the completion of
real scan data.
Huang Shizhe : Quantification and predication in Mandarin Chinese : a case study of dou
Huang Shizhe. Huang Shizhe : Quantification and predication in Mandarin Chinese : a case study of dou. In: Cahiers de linguistique - Asie orientale, vol. 26 1, 1997. pp. 159-164
"All White Horses Carry Day-old Rabbits" - a Linguistic Examination of Algebraic Properties of Individuals and Events
Faculty research talk delivered by Shizhe Huang, Haverford College Associate Professor of Chinese & Linguistics, on November 30, 2011 at Haverford College, KINSC, Sharpless Auditorium
Influence of the shutdown process of the driveline on the generator bearing life in the diesel generator set
Diesel generator sets are commonly used as power sources in transportation due to their versatility and cost-effectiveness. During the shutdown process, the diesel engine's cylinder compression pressure causes forced vibration in the driveline system through the crank linkage mechanism, resulting in unsteady loads that threaten bearing life. To address this issue, a coupling forward design method is proposed that takes into account the impact of unsteady loads on bearing life. An experiment was conducted on the shutdown process of a 16V280 diesel generator set, and a driveline dynamic model was established. A cumulative damage value that connects unsteady loads and bearing life was introduced to quantify the effect of unsteady loads on bearing life during the shutdown process. The unsteady loads included torque fluctuation and collision forces. The results showed that reducing the driveline key gap and increasing the coupling stiffness can decrease the combined load on the bearings and improve bearing life. A high-stiffness coupling was designed to achieve a smooth shutdown and a 43.19% reduction in bearing life damage, confirming the feasibility of the design method with respect to bearing life.
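The abstract does not define its cumulative damage value. As a hedged sketch of how unsteady loads are commonly linked to bearing life, the code below uses the Palmgren-Miner linear damage sum together with the basic rating life formula L10 = 10^6 (C/P)^p revolutions (p = 3 for ball bearings, 10/3 for roller bearings). This is an assumed stand-in, not the paper's method; all function names and numbers are hypothetical.

```python
def l10_revolutions(dynamic_rating_n, load_n, exponent=3.0):
    """Basic rating life in revolutions: 1e6 * (C / P) ** p."""
    if load_n <= 0 or dynamic_rating_n <= 0:
        raise ValueError("loads and ratings must be positive")
    return 1e6 * (dynamic_rating_n / load_n) ** exponent


def cumulative_damage(load_blocks, dynamic_rating_n, exponent=3.0):
    """Palmgren-Miner sum over (equivalent load in N, revolutions) blocks."""
    return sum(
        revolutions / l10_revolutions(dynamic_rating_n, load, exponent)
        for load, revolutions in load_blocks
    )


def damage_reduction(before, after):
    """Relative reduction in damage, as a fraction of the original value."""
    return (before - after) / before


# Hypothetical load spectra for one bearing with C = 100 kN.
soft_coupling = [(10_000.0, 1e6), (20_000.0, 1e6)]
stiff_coupling = [(10_000.0, 1e6), (15_000.0, 1e6)]

d_soft = cumulative_damage(soft_coupling, 100_000.0)
d_stiff = cumulative_damage(stiff_coupling, 100_000.0)
reduction = damage_reduction(d_soft, d_stiff)
```

Because the life exponent is at least 3, a modest drop in peak load gives a disproportionately large drop in accumulated damage, which is why load smoothing during shutdown can matter for bearing life.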
Early Monitoring of Cotton Verticillium Wilt by Leaf Multiple “Symptom” Characteristics
Early diagnosis of cotton verticillium wilt (VW) and accurate assessment of disease severity are important prerequisites for preventing the large-scale development of cotton VW. Hyperspectral techniques have been widely used for monitoring the extent of plant diseases, but early detection of VW disease in cotton remains a challenge. In this study, the Boruta algorithm was used to select the key physiological characteristics (leaf temperature, chlorophyll a content, and equivalent water thickness) of cotton leaves at the early stage of VW disease, and then the Relief-F algorithm was used to select the spectral features indicating multiple "symptoms" of cotton VW disease at the early stage. In addition, a new cotton VW early monitoring indicator (CVWEI) was constructed by combining the weights of the new index and related bands using the analytic hierarchy process (AHP) and the entropy weight method (EWM). The study showed that the physiological indices constructed under VW stress were better indicators of VW disease than traditional vegetation indices; CVWEI achieved a high accuracy of 95% on the test set, with a Kappa coefficient of 0.89; and for monitoring disease severity, the test set R² was 0.73 and the RMSE was 3.15%, compared to the optimal classification constructed using a single spectral index. The results may provide new ideas and methods for early and accurate monitoring of VW and other fungal diseases.
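The abstract does not say how the AHP and EWM weights are combined. As a minimal sketch, the code below computes standard entropy weights from a non-negative indicator matrix and then merges them with a set of subjective weights by normalized product, which is one common choice and an assumption here. The function names, the AHP weights, and the data are hypothetical.

```python
import math


def entropy_weights(rows):
    """Entropy weight method for a matrix of non-negative indicator values.

    rows: list of samples, each a list of m indicator values (n >= 2 samples).
    Returns m weights that sum to 1.
    """
    n, m = len(rows), len(rows[0])
    diversities = []
    for j in range(m):
        column = [row[j] for row in rows]
        total = sum(column)
        probs = [value / total for value in column]
        # Shannon entropy scaled to [0, 1] by dividing by ln(n).
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        diversities.append(max(0.0, 1.0 - entropy))
    spread = sum(diversities)
    if spread < 1e-12:  # every indicator is constant: fall back to equal weights
        return [1.0 / m] * m
    return [d / spread for d in diversities]


def combine_weights(subjective, objective):
    """Merge two weight vectors by normalized element-wise product."""
    products = [s * o for s, o in zip(subjective, objective)]
    total = sum(products)
    return [p / total for p in products]


# Hypothetical data: 4 leaf samples, 3 indicators.
samples = [
    [0.30, 12.0, 5.0],
    [0.32, 15.0, 5.0],
    [0.29, 30.0, 5.0],
    [0.31, 45.0, 5.0],
]
ewm = entropy_weights(samples)
ahp = [0.5, 0.3, 0.2]  # hypothetical expert-derived weights
combined = combine_weights(ahp, ewm)
```

An indicator that barely varies across samples (the third column above) receives almost no entropy weight, so it contributes little to the combined index regardless of its subjective weight.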
Direct Experimental Evidence of Biomimetic Surfaces with Chemical Modifications Interfering with Adhesive Protein Adsorption
Current approaches to dealing with the worldwide problem of marine biofouling are to impart chemical functionality to the surface or to utilize microtopography inspired by nature. Previous reports have shown that introducing only a single method may not resist adhesion of mussels or inhibit biofouling in static forms. While it is promising to integrate the two methods to develop an effective antifouling strategy, related basic research is still lacking. Here, we have fabricated engineered shark skin surfaces with different feature heights and terminated with different chemical moieties. Atomic force microscopy (AFM) with a modified colloid probe technique and a quartz crystal microbalance with dissipation (QCM-D) monitoring method have been introduced to directly determine the interactions between adhesive proteins and functionalized surfaces. Our results indicate that the probe-surface adhesion strength decreases with increasing feature height, and it also decreases from the bare Si surface to alkyl and hydroxyl modification, which is attributed to different contact area domains and interaction mechanisms. Combining biomimetic microtopography and surface chemistry, our study provides a new perspective for designing and developing underwater antifouling materials.
Developing Guidelines and Ensuring Consistency for Chinese Text Annotation
With growing interest in Chinese Language Processing, numerous NLP tools (e.g. word segmenters, part-of-speech taggers, and parsers) for Chinese have been developed all over the world. However, since no large-scale bracketed corpora are available to the public, these tools are trained on corpora with different segmentation criteria, part-of-speech tagsets, and bracketing guidelines, and comparisons are therefore difficult. As a first step towards addressing this issue, we have been preparing a 100-thousand-word bracketed corpus since late 1998 and plan to release it to the public in summer 2000. In this paper, we address several challenges in building the corpus, namely, creating annotation guidelines, ensuring annotation accuracy, and maintaining a high level of community involvement.